On Learning and Covering Structured Distributions

Authors

  • Gautam Kamath
  • Constantinos Daskalakis
  • Leslie A. Kolodziejski
Abstract

We explore a number of problems related to learning and covering structured distributions.

Hypothesis Selection: We provide an improved and generalized algorithm for selecting a good candidate distribution from among competing hypotheses. Namely, given a collection of N hypotheses containing at least one candidate that is ε-close to an unknown distribution, our algorithm outputs a candidate which is O(ε)-close to the distribution. The algorithm requires O(log N/ε²) samples from the unknown distribution and O(N log N/ε²) time, which improves on previous results (such as the Scheffé estimator) by reducing the dependence of the running time on N from quadratic to quasilinear. Because such results are widely used for hypothesis selection, our algorithm immediately improves any procedure that relies on them.

Proper Learning Gaussian Mixture Models: We describe an algorithm for properly learning mixtures of two single-dimensional Gaussians without any separability assumptions. Given Õ(1/ε²) samples from an unknown mixture, our algorithm outputs a mixture that is ε-close in total variation distance, in time Õ(1/ε⁵). Our sample complexity is optimal up to logarithmic factors, and significantly improves upon both Kalai et al. [40], whose algorithm has a prohibitive dependence on 1/ε, and Feldman et al. [33], whose algorithm requires bounds on the mixture parameters and depends pseudo-polynomially on these parameters.

Covering Poisson Multinomial Distributions: We provide a sparse ε-cover for the set of Poisson Multinomial Distributions. Specifically, we describe a set of n^{O(k³)} · (k/ε)^{poly(k/ε)} distributions such that any Poisson Multinomial Distribution of size n and dimension k is ε-close to a distribution in the set. This is a significant sparsification over the previous best-known ε-cover due to Daskalakis and Papadimitriou [24], which is of size n^{f(k, 1/ε)}, where f is polynomial in 1/ε and exponential in k. This cover also implies an algorithm for learning Poisson Multinomial Distributions with a sample complexity that is polynomial in k, 1/ε, and log n.
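
For orientation, the quadratic-time baseline that the abstract names (the Scheffé estimator) can be sketched as a round-robin tournament over all pairs of hypotheses. This is a minimal illustration of that classical baseline, not the quasilinear algorithm of the thesis. The dictionary representation of discrete distributions and the function names `scheffe_winner` and `scheffe_tournament` are assumptions made for this example.

```python
from itertools import combinations


def scheffe_winner(p, q, samples):
    """Run one Scheffé test; return 0 if p wins, 1 if q wins.

    p and q are dicts mapping outcomes to probabilities (an assumed
    representation for this sketch).
    """
    support = set(p) | set(q)
    # Scheffé set: outcomes where p assigns strictly more mass than q.
    scheffe_set = {x for x in support if p.get(x, 0.0) > q.get(x, 0.0)}
    p_mass = sum(p.get(x, 0.0) for x in scheffe_set)
    q_mass = sum(q.get(x, 0.0) for x in scheffe_set)
    # Empirical mass of the Scheffé set under the samples.
    emp_mass = sum(1 for s in samples if s in scheffe_set) / len(samples)
    # The hypothesis whose predicted mass is closer to the empirical one wins.
    return 0 if abs(p_mass - emp_mass) <= abs(q_mass - emp_mass) else 1


def scheffe_tournament(hypotheses, samples):
    """Return the index of the hypothesis with the most pairwise wins.

    Every pair is compared, so the number of tests is quadratic in
    len(hypotheses).
    """
    wins = [0] * len(hypotheses)
    for i, j in combinations(range(len(hypotheses)), 2):
        winner = scheffe_winner(hypotheses[i], hypotheses[j], samples)
        wins[i if winner == 0 else j] += 1
    return max(range(len(hypotheses)), key=wins.__getitem__)


if __name__ == "__main__":
    candidates = [
        {0: 0.5, 1: 0.5},
        {0: 0.9, 1: 0.1},
        {0: 0.1, 1: 0.9},
    ]
    observed = [0] * 50 + [1] * 50  # balanced samples from a fair coin
    print(scheffe_tournament(candidates, observed))
```

The loop over `combinations` is where the quadratic dependence on N comes from; the abstract's contribution is to avoid comparing every pair.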


Publication date: 2014